hamiltonian descent
- Asia > Middle East > Jordan (0.04)
- North America > Canada > Ontario > Hamilton (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- (3 more...)
Hamiltonian descent for composite objectives
In optimization the duality gap between the primal and the dual problems is a measure of the suboptimality of any primal-dual point. In classical mechanics the equations of motion of a system can be derived from the Hamiltonian function, which is a quantity that describes the total energy of the system. In this paper we consider a convex optimization problem consisting of the sum of two convex functions, sometimes referred to as a composite objective, and we identify the duality gap to be the `energy' of the system. In the Hamiltonian formalism the energy is conserved, so we add a contractive term to the standard equations of motion so that this energy decreases linearly (ie, geometrically) with time. This yields a continuous-time ordinary differential equation (ODE) in the primal and dual variables which converges to zero duality gap, ie, optimality.
- Asia > Middle East > Jordan (0.04)
- North America > Canada > Ontario > Hamilton (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- (3 more...)
Reviews: Hamiltonian descent for composite objectives
This paper was a borderline case that led to significant discussion among the reviewers, but in the end we decided the novel ODE-style approach would be of strong interest to the NeurIPS community. In your revision, please address all promises in the rebuttal and comments in the reviews. If there are additional experiments or examples you can include to demonstrate practical value or application of this technique to machine learning tasks, please include these in the revision to better demonstrate where your method might be used---this was the main weakness identified by the reviewers.
Hamiltonian descent for composite objectives
In optimization the duality gap between the primal and the dual problems is a measure of the suboptimality of any primal-dual point. In classical mechanics the equations of motion of a system can be derived from the Hamiltonian function, which is a quantity that describes the total energy of the system. In this paper we consider a convex optimization problem consisting of the sum of two convex functions, sometimes referred to as a composite objective, and we identify the duality gap to be the energy' of the system. In the Hamiltonian formalism the energy is conserved, so we add a contractive term to the standard equations of motion so that this energy decreases linearly (ie, geometrically) with time. This yields a continuous-time ordinary differential equation (ODE) in the primal and dual variables which converges to zero duality gap, ie, optimality.
Hamiltonian descent for composite objectives
O', Donoghue, Brendan, Maddison, Chris J.
In optimization the duality gap between the primal and the dual problems is a measure of the suboptimality of any primal-dual point. In classical mechanics the equations of motion of a system can be derived from the Hamiltonian function, which is a quantity that describes the total energy of the system. In this paper we consider a convex optimization problem consisting of the sum of two convex functions, sometimes referred to as a composite objective, and we identify the duality gap to be the energy' of the system. In the Hamiltonian formalism the energy is conserved, so we add a contractive term to the standard equations of motion so that this energy decreases linearly (ie, geometrically) with time. This yields a continuous-time ordinary differential equation (ODE) in the primal and dual variables which converges to zero duality gap, ie, optimality.
Hamiltonian Descent Methods
Maddison, Chris J., Paulin, Daniel, Teh, Yee Whye, O'Donoghue, Brendan, Doucet, Arnaud
We propose a family of optimization methods that achieve linear convergence using first-order gradient information and constant step sizes on a class of convex functions much larger than the smooth and strongly convex ones. This larger class includes functions whose second derivatives may be singular or unbounded at their minima. Our methods are discretizations of conformal Hamiltonian dynamics, which generalize the classical momentum method to model the motion of a particle with non-standard kinetic energy exposed to a dissipative force and the gradient field of the function of interest. They are first-order in the sense that they require only gradient computation. Yet, crucially the kinetic gradient map can be designed to incorporate information about the convex conjugate in a fashion that allows for linear convergence on convex functions that may be non-smooth or non-strongly convex. We study in detail one implicit and two explicit methods. For one explicit method, we provide conditions under which it converges to stationary points of non-convex functions. For all, we provide conditions on the convex function and kinetic energy pair that guarantee linear convergence, and show that these conditions can be satisfied by functions with power growth. In sum, these methods expand the class of convex functions on which linear convergence is possible with first-order computation.
- Asia > Middle East > Jordan (0.04)
- Europe > Russia (0.04)
- Asia > Russia (0.04)
- (7 more...)
- Research Report (0.50)
- Instructional Material > Course Syllabus & Notes (0.45)